Abstract

Small single-domain proteins often exhibit only a single free-energy barrier,
or transition state, between the denatured and the native state. The folding
kinetics of these proteins is usually explored via mutational analysis. A
central question is which structural information on the transition state can be
derived from the mutational data. In this article, we model and structurally
interpret mutational Φ-values for two small β-sheet proteins, the PIN and the
FBP WW domain. The native structure of these WW domains comprises two
β-hairpins that form a three-stranded β-sheet. In our model, we assume that the
transition state consists of two conformations in which either one of the
hairpins is formed. Such a transition state has been recently observed in
Molecular Dynamics folding-unfolding simulations of a small designed
three-stranded β-sheet protein. We obtain good agreement with the experimental
data (i) by splitting up the mutation-induced free-energy changes into terms
for the two hairpins and for the small hydrophobic core of the proteins, and
(ii) by fitting a single parameter, the relative degree to which hairpin 1 and
2 are formed in the transition state. The model helps to understand how
mutations affect the folding kinetics of WW domains, and also captures negative
Φ-values that have been difficult to interpret.



Introduction 



How proteins fold into their native 3-dimensional structure remains an intriguing
question. Given the vast number of unfolded protein conformations, Cyrus
Levinthal argued in 1968 [1,2] that proteins are guided to their native structure
by a sequence of folding intermediates. In the following decades, experimentalists 
focused on detecting and characterizing metastable intermediates with a variety 
of methods [3]. While such folding intermediates continue to be of considerable 
interest [4,5], the view that proteins have to fold in sequential pathways from inter- 
mediate to intermediate, now known as 'old view' [6,7], changed in the '90s when 
statistical-mechanical models demonstrated that fast and efficient folding can also 
be achieved on funnel energy landscapes that are smoothly biased towards the 
native state and do not exhibit metastable intermediates [8,9]. The paradigmatic 
proteins of this 'new view' are two-state proteins, first discovered in 1991 [10]. 
Two-state proteins fold from the denatured state to the native state without
experimentally detectable intermediate states. Since then, many small
single-domain proteins have been shown to fold with two-state kinetics [11-13].

The folding dynamics of two-state proteins is thought to be dominated by a single 
free-energy barrier, or transition state, between the denatured and native state. 
This transition state of the protein folding reaction is an unstable, short-lived state
and cannot be observed directly. Instead, the dynamics of two-state proteins is
often explored via mutational analysis [14-33]. In such an analysis, a large number
of mostly single-residue mutants of a protein is generated. For each mutant, the
effect of the mutation on the folding dynamics is usually quantified by its
Φ-value [12,34]

$$\Phi = \frac{RT \ln(k_{wt}/k_{mut})}{\Delta G_N} \tag{1}$$

Here, k_wt is the folding rate for the wildtype protein, k_mut is the folding rate for
the mutant protein, and ΔG_N is the change of the protein stability induced by
the mutation. The stability G_N of a protein is the free energy difference between
the denatured state D and the native state N. In classical transition-state theory,
the folding rate of a two-state protein is proportional to exp[−G_T/RT], where G_T
is the free energy difference from the denatured state to the transition state. It is
usually assumed that the prefactor of this proportionality relation does not depend
on the mutation. In this notation, Φ-values have the form

$$\Phi = \frac{\Delta G_T}{\Delta G_N} \tag{2}$$

where ΔG_T is the mutation-induced change of the free-energy barrier G_T.
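The definition in eq. (1) can be illustrated with a minimal Python sketch. The function name `phi_value`, the default temperature of 283 K, and the example rates are illustrative assumptions and are not taken from the experimental data; free energies are assumed to be in kcal/mol.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)

def phi_value(k_wt, k_mut, dG_N, temperature=283.0):
    """Eq. (1): Phi = R T ln(k_wt / k_mut) / dG_N, with dG_N in kcal/mol."""
    if dG_N == 0:
        raise ValueError("Phi is undefined for dG_N = 0")
    # Mutation-induced change of the free-energy barrier, eq. (2) numerator.
    dG_T = R_KCAL * temperature * math.log(k_wt / k_mut)
    return dG_T / dG_N

# Hypothetical mutant that folds half as fast and loses 1 kcal/mol of stability.
phi = phi_value(k_wt=100.0, k_mut=50.0, dG_N=1.0)
print(round(phi, 2))
```

Only the ratio of the two rates enters, so the units of k_wt and k_mut cancel, which reflects the assumption that the prefactor does not depend on the mutation.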

The mutational Φ-value data for a protein provide indirect information on its
folding dynamics and, therefore, have attracted considerable theoretical interest.
The central question is: Which transition-state structures and free-energy
perturbations are consistent with the experimentally measured Φ-values?
In this article, we model Φ-values from detailed mutational analyses [25,29,33] of
two small β-sheet proteins, the FBP and PIN WW domains. The native structure
of these proteins consists of two hairpins forming a three-stranded sheet [35,36] (see
Fig. 1). The design principles [37,38] and folding kinetics [25,29,33,39-47] of WW
domains and other three-stranded β-sheet proteins have been studied extensively.
Because of their small size and abundance as protein domains, WW domains are
important model systems for understanding β-sheet folding and stability.

Molecular Dynamics (MD) simulations with atomistic models are computationally
demanding and in general do not allow direct calculations of folding rates and
Φ-values. With additional assumptions on transition states or Φ-values,
transition-state conformations have been extracted from MD unfolding trajectories
at elevated temperatures [48-50] or constructed from MD simulations that use
Φ-values as restraints [51,52].¹ However, for a small, designed three-stranded
β-sheet protein, beta3s, transition-state conformations [60] and Φ-values [61] have
been more rigorously determined from extensive equilibrium folding-unfolding MD
simulations. The native structure of beta3s is similar to the structure of WW domains,
with two β-hairpins forming an antiparallel three-stranded β-sheet. Rao et al. [60]
performed four MD simulations of beta3s at the temperature 330 K with a total
length of 12.6 μs, and observed 72 folding and 73 unfolding events. By identifying
clusters of structurally similar conformations that have the probability p_fold = 0.5
to fold [62-64], and the same probability to unfold, Rao et al. obtained a
transition-state ensemble for beta3s that is "characterized by the presence of one of
the two native hairpins formed while the rest of the peptide is mainly unstructured" [60].
The two β-hairpins of beta3s thus appear to be cooperative substructures that are
either fully structured or unstructured in the transition state.

¹ In statistical-mechanical or Gō-type models with simplified energy landscapes, in
contrast, folding rates and stabilities for wildtype and mutants can be easily
calculated [53-59]. However, the lack of atomistic detail in these models appears to
make it difficult to reproduce detailed mutational data.

Here, we show that a statistical-mechanical model with a beta3s-like transition-state
ensemble in which either hairpin 1 or hairpin 2 is formed leads to an overall
consistent interpretation of experimental Φ-values for the FBP and PIN WW domains.
In this model, mutations can affect hairpin 1, hairpin 2, or the small
hydrophobic core of the WW domains, which is not yet structured in the transition
state. The general form of Φ-values in this model is

$$\Phi = \frac{\Delta G_T}{\Delta G_N} = \frac{\chi_1 \Delta G_1 + \chi_2 \Delta G_2}{\Delta G_N} \tag{3}$$

where χ_1 is the probability, or fraction, of the transition-state conformation in
which hairpin 1 is formed, and χ_2 = 1 − χ_1 is the probability of the
transition-state conformation with hairpin 2 formed. The mutation-induced changes of the
free energy difference between the two transition-state conformations and the
denatured state are denoted by ΔG_1 and ΔG_2. The model has just two structural
parameters, χ_1 and χ_2, which are obtained from a comparison with the experimental
data. Different Φ-values for different mutations simply arise from different
'free-energy signatures' ΔG_1, ΔG_2, and ΔG_N of the mutations.
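A short Python sketch of eq. (3) shows how different free-energy signatures lead to different Φ-values. The function name `phi_model` and all numerical inputs are hypothetical and chosen only for illustration.

```python
def phi_model(chi1, dG1, dG2, dG_N):
    """Eq. (3): Phi = (chi1*dG1 + chi2*dG2) / dG_N with chi2 = 1 - chi1."""
    if not 0.0 <= chi1 <= 1.0:
        raise ValueError("chi1 must lie between 0 and 1")
    chi2 = 1.0 - chi1
    return (chi1 * dG1 + chi2 * dG2) / dG_N

# Mutation that affects only hairpin 1 (dG2 = 0, dG_N = dG1): Phi equals chi1.
only_hp1 = phi_model(chi1=0.8, dG1=1.2, dG2=0.0, dG_N=1.2)

# Hypothetical mutation that stabilizes hairpin 2 (dG2 < 0) while the native
# state is destabilized overall (dG_N > 0): Phi becomes negative.
negative = phi_model(chi1=0.8, dG1=0.0, dG2=-0.5, dG_N=1.0)
print(only_hp1, negative)
```

In this sketch a Φ-value outside the range from 0 to 1 needs no special treatment; it simply follows from contributions of different sign in the numerator and denominator.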

In particular, the model reproduces the negative Φ-value for the mutation L36A
of the FBP WW domain. The mutation destabilizes the native state (ΔG_N > 0),
but stabilizes hairpin 2 (ΔG_2 < 0), according to calculations with the empirical
force field FOLD-X [65,66]. This leads to a negative Φ-value in eq. (3) since ΔG_1
equals 0 for this mutation. In general, 'nonclassical' Φ-values, i.e. Φ-values that
are negative or larger than 1, are obtained in the model if mutations stabilize some
structural elements, but destabilize others. The mutation L36A of the FBP WW
domain, for example, stabilizes hairpin 2, but destabilizes the hydrophobic core.

Nonclassical Φ-values have been difficult to interpret in the traditional
interpretation. In this interpretation, a Φ-value is taken to indicate the degree of
structure formation of the mutated residue in the transition-state ensemble T [12].
A Φ-value of 1 is interpreted to indicate that the residue has a native-like structure in
T, since the mutation shifts the free energy of the transition state T by the same
amount as the free energy of the native state N. A Φ-value of 0 is interpreted to
indicate that the residue is as unstructured in T as in the denatured state D, since
the mutation does not shift the free-energy difference between these two states.
Φ-values between 0 and 1 are typically taken to indicate partial native-like structure
in T. For a protein with M residues, the traditional interpretation thus implies
M structural parameters, the degrees of structure formation of all residues. In
contrast, the model presented here has just a single independent parameter, the
relative degree to which hairpin 1 and 2 are populated in T. Since degrees of
structure formation have to be between 0 ('denatured-like') and 1 ('native-like'), the
traditional interpretation cannot explain nonclassical Φ-values smaller than 0 or
larger than 1. In the present model, nonclassical Φ-values arise from substructural
free-energy contributions of different sign (see above).

We have recently suggested a related, novel model for Φ-values of mutations in
α-helices of a protein [67,68]. The model is based on cooperative helix formation
and on splitting mutation-induced free energy changes in helices into secondary
and tertiary terms [68]. The two structural model parameters are the degrees of
secondary and tertiary structure formation of the helix in the transition state. For
several well-characterized helices [68], fitting these two parameters to mutational
data leads to a consistent, structural interpretation of the Φ-values. The general
conclusion from our helix model and the model for small β-sheet proteins presented
here is that a consistent structural interpretation of Φ-values (i) requires splitting
up mutation-induced stability changes into free-energy contributions from different
substructural elements of a protein, and (ii) can be obtained with few parameters
that characterize the degree of structure formation of cooperative elements such
as α-helices and β-hairpins in the transition-state ensemble.



Model 



The central assumption of our model is that each of the hairpins is either fully
formed or not formed in the transition-state ensemble of the protein. The model
then has four states: the denatured state D in which none of the hairpins is formed,
a transition-state conformation in which only hairpin 1 is formed, a transition-state
conformation in which only hairpin 2 is formed, and the native state with both
hairpins formed. The energy landscape can be characterized by three free-energy
differences: the free-energy difference G_N of the native state and the free-energy
differences G_1 and G_2 of the transition-state conformations with respect to the
denatured state (see Fig. 2).

The folding kinetics is described by the master equation

$$\frac{dP_n(t)}{dt} = \sum_{m \neq n} \left[ w_{nm} P_m(t) - w_{mn} P_n(t) \right] \tag{4}$$

which gives the time evolution of the probability P_n(t) that the protein is in state
n at time t. Here, w_nm is the transition rate from state m to n, defined by

$$w_{nm} = \frac{1}{t_o} \left( 1 + e^{(G_n - G_m)/RT} \right)^{-1} \tag{5}$$

provided the states n and m are connected via a single step in which only a single
hairpin folds or unfolds [69]. For other transitions, i.e. for the direct transition
from the denatured state to the native state, and vice versa, the transition rates
are zero. Here, t_o is a reference time scale. The transition rates defined above obey
detailed balance w_nm P_m^e = w_mn P_n^e, where P_n^e ~ exp[−G_n/(RT)] is the equilibrium
weight for the state n. Detailed balance ensures that the system ultimately reaches
thermal equilibrium.
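The transition rates of eq. (5) and the detailed-balance condition can be checked numerically. The following Python sketch assumes free energies in units of RT and rates in units of 1/t_o; the state labels, the helper names `rate` and `transition_rates`, and the example free energies are illustrative choices.

```python
import math

STATES = ["D", "hp1", "hp2", "N"]
# Single-step connections: exactly one hairpin folds or unfolds.
CONNECTED = {("D", "hp1"), ("D", "hp2"), ("hp1", "N"), ("hp2", "N")}

def rate(g_to, g_from):
    """Eq. (5) in units of 1/t_o, with free energies in units of RT."""
    return 1.0 / (1.0 + math.exp(g_to - g_from))

def transition_rates(g1, g2, gN):
    """Return (w, g) where w[(n, m)] is the rate from state m to state n."""
    g = {"D": 0.0, "hp1": g1, "hp2": g2, "N": gN}
    w = {}
    for n in STATES:
        for m in STATES:
            linked = (n, m) in CONNECTED or (m, n) in CONNECTED
            w[(n, m)] = rate(g[n], g[m]) if linked else 0.0
    return w, g

# Illustrative landscape: barriers of 5 RT and 6 RT, native state at -8 RT.
w, g = transition_rates(5.0, 6.0, -8.0)
for n in STATES:
    for m in STATES:
        # Detailed balance: w_nm * exp(-G_m) equals w_mn * exp(-G_n).
        lhs = w[(n, m)] * math.exp(-g[m])
        rhs = w[(m, n)] * math.exp(-g[n])
        assert math.isclose(lhs, rhs, rel_tol=1e-9)
```

The forward and backward rates of each connected pair add up to 1 in these units, which is a property of the rate form in eq. (5).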

The master equation of this four-state model can be solved exactly (see Appendix).
For high transition-state barriers G_1 ≫ RT and G_2 ≫ RT and a stable native
state with G_N ≪ −RT, the folding rate is given by

$$k \simeq \frac{1}{2} \left( e^{-G_1/RT} + e^{-G_2/RT} \right) \tag{6}$$

in units of 1/t_o. The folding rate k simply is the sum of the rates for the two
possible folding routes on which either hairpin 1 or hairpin 2 forms first. The
factor 1/2 in the equation above arises because a molecule, after reaching one of
the barrier states 1 or 2, either proceeds to N or returns to D, with almost equal
probability.

Mutations correspond to perturbations of the free energy landscape. A mutation
therefore can be characterized by the free energy changes ΔG_1, ΔG_2, and
ΔG_N. The folding rate of the mutant then is k_mut = k(G_1 + ΔG_1, G_2 + ΔG_2).
For small perturbations ΔG_1 and ΔG_2, a Taylor expansion of ln k_mut to first
order leads to

$$\ln k_{mut} - \ln k_{wt} \simeq \frac{\partial \ln k}{\partial G_1} \Delta G_1 + \frac{\partial \ln k}{\partial G_2} \Delta G_2 = -\frac{1}{RT} \left( \chi_1 \Delta G_1 + \chi_2 \Delta G_2 \right) \tag{7}$$

with

$$\chi_1 = \frac{e^{-G_1/RT}}{e^{-G_1/RT} + e^{-G_2/RT}} \quad \text{and} \quad \chi_2 = \frac{e^{-G_2/RT}}{e^{-G_1/RT} + e^{-G_2/RT}} \tag{8}$$

The two parameters χ_1 and χ_2 quantify the extent to which the transition-state
conformation 1 and the transition-state conformation 2 are populated in the
transition-state ensemble. From the Φ-value definition (1) and eq. (7), we obtain the
general form of Φ-values given in eq. (3).
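The first-order expansion in eq. (7) can be compared numerically with the exact change of ln k obtained from the folding rate of eq. (6). This Python sketch works in units of RT = 1 and 1/t_o; the barrier heights and perturbations are hypothetical values chosen for illustration.

```python
import math

def folding_rate(G1, G2, RT=1.0):
    """Eq. (6): folding rate in units of 1/t_o."""
    return 0.5 * (math.exp(-G1 / RT) + math.exp(-G2 / RT))

def chi(G1, G2, RT=1.0):
    """Eq. (8): populations of the two transition-state conformations."""
    b1, b2 = math.exp(-G1 / RT), math.exp(-G2 / RT)
    return b1 / (b1 + b2), b2 / (b1 + b2)

# Illustrative barriers (in RT) and small mutation-induced perturbations.
G1, G2 = 5.0, 6.0
dG1, dG2 = 0.02, -0.01
chi1, chi2 = chi(G1, G2)

exact = math.log(folding_rate(G1 + dG1, G2 + dG2)) - math.log(folding_rate(G1, G2))
first_order = -(chi1 * dG1 + chi2 * dG2)  # right-hand side of eq. (7), RT = 1
print(chi1, chi2, exact, first_order)
```

The two results differ only at second order in the perturbations, so the agreement degrades for mutations with large ΔG_1 or ΔG_2.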



Results 



FBP WW domain 



We first consider the FBP WW domain. Petrovich et al. [33] have performed an
extensive mutational analysis of the folding kinetics. The Φ-values and stability
changes ΔG_N for the considered mutations are summarized in Table 1, together
with an assessment of which structural elements are affected by the mutations. This
assessment is based on the contact matrix of the FBP WW domain shown in Fig. 3.
A black dot at position (i, j) of this matrix indicates that the two amino acids i and
j are in contact, i.e. that the distance between any of their non-hydrogen atoms is
smaller than the cutoff distance 4 Å. Since the contact matrix is symmetric, only
one half is represented in Fig. 3. The two contact clusters in the matrix correspond
to hairpin 1 and hairpin 2 of the FBP WW domain. The remaining contacts largely
correspond to contacts of hydrophobic amino acids, the small 'hydrophobic core'
of the protein. About half of the mutations performed by Petrovich et al. affect
only either hairpin 1 or hairpin 2. The mutation E7A of amino acid 7, for example,
affects the contacts (7,22), (7,23), and (7,24), which are all located in hairpin 1
(see contact map in Fig. 3). The remaining mutations also affect the hydrophobic
core, or both hairpins. The mutation Y21A, for example, affects the contacts (8,21)
and (9,21) in hairpin 1, and the contacts (21,26), (21,27), and (21,28) in hairpin 2.
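The contact criterion described above can be sketched in a few lines of Python. The function names, the `min_separation` parameter, and the toy coordinates are assumptions for illustration; the text does not state whether sequence neighbours are excluded, and real input would come from the heavy-atom coordinates of a PDB structure.

```python
import math

def in_contact(atoms_i, atoms_j, cutoff=4.0):
    """True if any pair of non-hydrogen atoms is closer than the cutoff (in Å)."""
    return any(math.dist(a, b) < cutoff for a in atoms_i for b in atoms_j)

def contact_pairs(residues, cutoff=4.0, min_separation=1):
    """Upper triangle of the symmetric contact matrix as a list of (i, j) pairs.

    `residues` maps a residue number to a list of heavy-atom (x, y, z) tuples.
    `min_separation` is a hypothetical option for skipping sequence neighbours.
    """
    numbers = sorted(residues)
    return [(i, j) for i in numbers for j in numbers
            if j - i >= min_separation
            and in_contact(residues[i], residues[j], cutoff)]

# Toy coordinates (not from a real structure): only residues 7 and 22 touch.
toy = {
    7: [(0.0, 0.0, 0.0)],
    8: [(10.0, 0.0, 0.0)],
    22: [(3.0, 0.0, 0.0), (20.0, 0.0, 0.0)],
}
print(contact_pairs(toy))
```

Listing only pairs with i < j corresponds to showing one half of the symmetric matrix.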

To test our model, we first consider all mutations that affect only one of the
hairpins. The model predicts that all mutations that affect only hairpin 1 should
have the same Φ-value χ_1, and all mutations that affect only hairpin 2 the same
Φ-value χ_2. This is a direct consequence of eq. (3). For mutations that affect
only hairpin 1, for example, we have ΔG_2 = 0 since the mutations do not shift the
stability of hairpin 2, and ΔG_N = ΔG_1 since they also do not affect the hydrophobic
core. Eq. (3) then results in Φ = χ_1 for these mutations. The Φ-values for the
ten mutations that only affect hairpin 1 are plotted in Fig. 4. Except for one clear
outlier,² all Φ-values are centered around the value 0.8, mostly within experimental
errors. The mean value of these nine Φ-values (dashed line in Fig. 4) leads to the
estimate χ_1 = 0.81 ± 0.06. The error here is estimated as error of the sample
mean. The standard deviation of the Φ-values from the mean value is 0.18.
The four Φ-values for mutations that affect only hairpin 2 range from 0.08 to
0.39 (see Table 1), with mean value χ_2 = 0.30 ± 0.08 and standard deviation
0.16. For both sets of mutations, we thus obtain good agreement with the model.
In addition, the sum of the above estimated values for the model parameters χ_1
and χ_2 is close to 1, within the error bounds, which is an additional consistency
requirement of the model. The two parameters χ_1 and χ_2 are the fractions to
which the two transition-state conformations with either hairpin 1 or hairpin 2
formed are populated. These fractions sum up to 1 since the protein has to take
one of the possible routes in the model.
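As a small worked example, the estimate of χ_2 can be recomputed from the Φ-values of the four mutations that are tagged as affecting only hairpin 2 in Table 1 (T25A, T25S, L26A, L26G). The Python sketch below uses the sample standard deviation and the standard error of the mean; that these are exactly the estimators used in the original analysis is an assumption.

```python
import statistics

# Phi-values of the four mutations tagged "hp 2" only in Table 1.
phi_hp2 = {"T25A": 0.39, "T25S": 0.27, "L26A": 0.08, "L26G": 0.45}

values = list(phi_hp2.values())
mean = statistics.mean(values)   # estimate of chi_2
sd = statistics.stdev(values)    # sample standard deviation
sem = sd / len(values) ** 0.5    # error of the sample mean
print(f"chi_2 = {mean:.2f} +/- {sem:.2f}, SD = {sd:.2f}")
```

With these inputs the rounded results agree with the values quoted in the text for hairpin 2.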

To include other mutations in the model, we have to estimate the impact of these
mutations on the stability of the different structural elements they affect (hairpin
1, hairpin 2, or the hydrophobic core). We use FOLD-X here, a molecular modeling
program for the prediction of mutation-induced stability changes [65,66]. The
FOLD-X force field includes terms for backbone and sidechain entropies, which
have been weighted against other terms using experimental data from mutational
stability analyses. FOLD-X has been tested on a set of 1088 point mutants and
reproduces the stability changes of 1030 of these mutants with a correlation
coefficient of 0.83 and a standard deviation of 0.81 kcal/mol [65]. With FOLD-X,
we calculate the mutation-induced stability changes ΔG_N for the whole FBP WW
domain, and the stability changes ΔG_1 and ΔG_2 of hairpin 1 and 2, depending
on whether the mutation affects these hairpins. To calculate ΔG_1 and ΔG_2, we
simply 'cut out' these hairpins from the PDB structure and estimate the stability
of the wildtype and mutant hairpins with FOLD-X (see caption of Table 2 for
details). The resulting data are summarized in Table 2. The calculated stability
changes ΔG_N can be directly compared to the experimentally measured stability
changes ΔG_N,exp. We include here only mutations in the model for which the
FOLD-X predicted stability changes ΔG_N do not differ by more than a factor of 2
from the experimental stability changes ΔG_N,exp. For other mutations, the
force-field calculations are unreliable. In Table 2, the calculated stability changes for
these mutations are shown in brackets.

² The data point for the mutation T9A can be confirmed as an outlier, e.g., with
Grubbs' test [70] at the standard significance level of 5 %. For a set of 10 data
points as here, a value x is an outlier for z = |x − x̄|/SD > 2.29, where x̄ is the
sample mean and SD the standard deviation. For the mutation T9A with Φ-value −0.09,
the z-value 2.43 exceeds the critical value 2.29.

The mutations in Table 2 affect two of the structural elements. The mutations
W8F and T13A affect hairpin 1 and the hydrophobic core. For these mutations,
we have ΔG_2 = 0, and Φ = χ_1 ΔG_1/ΔG_N according to eq. (3). The mutation
Y21A affects both hairpins, hence Φ = (χ_1 ΔG_1 + χ_2 ΔG_2)/(ΔG_1 + ΔG_2). Finally,
the mutations T29G, W30A, and L36V affect hairpin 2 and the hydrophobic core.
Therefore, we have ΔG_1 = 0 for these mutations, and Φ = χ_2 ΔG_2/ΔG_N.

Let us now consider the set of 20 mutations that consists of these 6 mutations
that affect two structural elements and the 14 mutations that affect either only
hairpin 1 or only hairpin 2. Our model has two parameters, χ_1 and χ_2. However,
since χ_1 + χ_2 = 1, there is only one independent parameter. We determine this
parameter from a least-squares fit between the theoretical Φ-value formula given in
eq. (3) and the experimental Φ-values and obtain the values χ_1 = 0.77 ± 0.05 and
χ_2 = 0.23 ± 0.05, see Fig. 5.
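Because eq. (3) is linear in χ_1 once χ_2 = 1 − χ_1 is inserted, a one-parameter least-squares fit has a closed-form solution. The Python sketch below assumes an unweighted fit (the text does not specify weights) and uses synthetic, noise-free data generated with χ_1 = 0.75; neither the function name `fit_chi1` nor the numbers come from the experimental tables.

```python
def fit_chi1(mutations):
    """Least-squares estimate of chi1 for Phi = (chi1*dG1 + (1-chi1)*dG2)/dG_N.

    `mutations` is a list of (phi_exp, dG1, dG2, dG_N) tuples.
    """
    num = den = 0.0
    for phi_exp, dG1, dG2, dG_N in mutations:
        a, b = dG1 / dG_N, dG2 / dG_N  # model Phi for chi1 = 1 and for chi1 = 0
        num += (a - b) * (phi_exp - b)
        den += (a - b) ** 2
    if den == 0.0:
        raise ValueError("data do not constrain chi1")
    return num / den

# Synthetic, noise-free data generated with chi1 = 0.75 (not experimental values).
synthetic = [
    (0.75, 1.0, 0.0, 1.0),     # affects hairpin 1 only
    (0.25, 0.0, 1.0, 1.0),     # affects hairpin 2 only
    (0.375, 0.5, 0.0, 1.0),    # affects hairpin 1 and the hydrophobic core
    (-0.125, 0.0, -0.5, 1.0),  # stabilizes hairpin 2, destabilizes the core
]
chi1 = fit_chi1(synthetic)
print(chi1, 1.0 - chi1)
```

Mutations with ΔG_1 = ΔG_2 carry no information about χ_1 in this model, which is why the denominator guards against data that do not constrain the parameter.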

PIN WW domain 

Mutational analyses of the PIN WW domain's folding kinetics have been performed
by Jäger et al. [25] and Deechongkit et al. [29]. While Jäger et al. have considered
standard single-site amino-acid replacements, Deechongkit et al. synthesized
amide-to-ester mutants that specifically perturb backbone H-bonds. The experimental
Φ-values and stability changes ΔG_N,exp for these mutations are summarized in Table
3. The synthetic amino acids in the mutations of Deechongkit et al. are denoted by
lowercase Greek letters (last six lines in Table 3). Since these mutations perturb the
backbone H-bonds, they only affect either hairpin 1 or hairpin 2, which is indicated
in the last column in Table 3. For the mutations considered by Jäger et al., the
affected structural elements are again assessed based on the contact map shown
in Fig. 3. We consider here only mutations with stability changes ΔG_N,exp > 0.8
kcal/mol. Φ-values of mutations that cause significantly smaller stability changes
are often considered unreliable [30,71,72] (see also Discussion).

Seven mutations in Table 3 affect only hairpin 1 of the PIN WW domain. The mean
value of the Φ-values for these mutations leads to the estimate χ_1 = 0.69 ± 0.05.
The standard deviation of the Φ-values from the mean is 0.12, which is comparable
to the experimental errors. The four Φ-values of the mutations that affect only
hairpin 2 have the mean value χ_2 = 0.36 ± 0.05 and the standard deviation 0.10. In
agreement with our model, these estimates for χ_1 and χ_2 again add up to 1, within
the statistical errors. In an alternative approach, the values of χ_1 and χ_2 can be
obtained from a least-squares fit between theoretical and experimental Φ-values (see
Fig. 6). From the fit, we obtain χ_1 = 0.67 ± 0.05 and χ_2 = 1 − χ_1 = 0.33 ± 0.05,
and a Pearson correlation coefficient of 0.85 between theoretical and experimental
values.

We do not include mutations that affect more than one structural element here
since the stability changes estimated with FOLD-X appear to be unreliable. For
four of the five mutants, the calculated stability changes ΔG_N differ by significantly
more than a factor of 2 from the experimental values ΔG_N,exp (data not shown). The
stabilities for the PIN WW domain mutants may be more difficult to calculate since they
involve a larger range of amino acids, compared to the FBP WW mutants that
mostly involve changes to the small amino acids alanine or glycine, which can be
modeled via simple truncation of sidechains prior to the FOLD-X calculations.



Discussion and Conclusions 

We have modeled Φ-values from extensive mutational analyses of two WW domains
based on the central assumption that the transition-state ensemble of these proteins
consists of two substates in which either hairpin 1 or hairpin 2 is formed. The
structural information obtained from the mutational data by fitting a single model
parameter is that the transition-state ensemble of the FBP WW domain consists
to roughly 3/4 of substate 1 with hairpin 1 formed, and to 1/4 of substate 2 with
hairpin 2 formed. The transition-state ensemble of the PIN WW domain consists
to roughly 2/3 of substate 1, and to 1/3 of substate 2, according to the model.

In the model, the magnitude of a Φ-value depends on which structural elements are
affected, and on the mutation-induced free energy changes of these elements. The
mutation E7A of the FBP WW domain, for example, has a relatively large Φ-value
since this mutation only affects hairpin 1, which is structured in the dominant
substate 1 of the transition-state ensemble, whereas the mutation W8F has a
relatively small Φ-value since the mutation mainly affects the free energy of the
small hydrophobic core, which is not yet formed in the transition state. The
model also reproduces the negative Φ-value of the mutation L36A, which results
from different signs of the mutation-induced free energy changes ΔG_2 and ΔG_N
in Table 2. According to the free-energy calculations with FOLD-X, the mutation
stabilizes hairpin 2 (ΔG_2 < 0), but has an overall destabilizing effect (ΔG_N > 0)
since it destabilizes the hydrophobic core.

The deviations between experimental and theoretical Φ-values are within reasonable
errors. It has been recently suggested that experimental errors for Φ-values
may be underestimated since it is usually assumed that the errors in the measured
free energy changes of the transition state and the folded state are independent,
which is not the case [73]. In the case of the PIN WW domain, we have only considered
mutations with stability changes ΔG_N > 0.8 kcal/mol. For mutations that induce
significantly smaller stability changes, experimental errors in ΔG_N may lead to
large errors in Φ-values since ΔG_N constitutes the denominator of the Φ-value
defined in eq. (1).
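The sensitivity to small stability changes can be illustrated with standard first-order error propagation for the ratio Φ = ΔG_T/ΔG_N. The Python sketch below deliberately uses the common simplifying assumption of independent errors, which the text above notes may underestimate the true uncertainty; the function name and all numbers are hypothetical.

```python
import math

def phi_error(dG_T, dG_N, sigma_T, sigma_N):
    """First-order error of Phi = dG_T / dG_N, assuming independent errors."""
    phi = dG_T / dG_N
    # Partial derivatives: dPhi/d(dG_T) = 1/dG_N, dPhi/d(dG_N) = -Phi/dG_N.
    return math.hypot(sigma_T / dG_N, phi * sigma_N / dG_N)

# Same Phi = 0.5 and same absolute errors of 0.1 kcal/mol, decreasing dG_N.
errors = {dG_N: phi_error(0.5 * dG_N, dG_N, 0.1, 0.1) for dG_N in (2.0, 0.8, 0.3)}
for dG_N, err in errors.items():
    print(dG_N, round(err, 2))
```

For fixed absolute errors, the propagated error scales as 1/ΔG_N, which is one way to motivate a lower cutoff on ΔG_N.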

However, the large Φ-values up to 1.8 for three mutations with small stability
changes in the loop of hairpin 1 of the PIN WW domain [25], which have not
been considered here, may also result from structural rearrangements. Jäger et
al. [25] have suggested a five-state model with two consecutive transition states.
In the first transition state, only the loop of hairpin 1 is formed. Nonclassical
Φ-values greater than 1 are obtained in this model for mutations that are assumed
to shift the free energy of the loop by a larger amount than the free energy of
the native state. With the same assumption, large nonclassical Φ-values in the
loop of hairpin 1 are also obtained in the four-state model presented here. For
χ_1 = 0.67, for example, a Φ-value of 1.8 is obtained for a mutation in this loop
with ΔG_1 = 2.7 ΔG_N, according to eq. (3), since hairpin 2 and, thus, ΔG_2 are
not affected by this mutation. Such a situation may result from a structural
rearrangement between the transition-state conformation with hairpin 1 formed
and the native state. The structural rearrangement may affect the sidechains in
the loop, but should not affect the backbone hydrogen bonds since the Φ-values
for the amide-to-ester mutations S16σ, R17ρ, and S19σ in this loop are between
0.70 and 0.83 (see Table 3). Within the experimental and statistical errors, these
Φ-values are close to χ_1 = 0.67, which is the expected Φ-value for mutations with



Appendix: Exact solution of the master equation 

The master equation (4) can be written in the matrix form:

$$\frac{dP(t)}{dt} = -W P(t) \tag{9}$$

The elements of the vector P(t) are the probabilities P_n(t) that the protein is in
state n at time t, and the matrix elements of W are given by

$$W_{nm} = -w_{nm} \ \text{for} \ n \neq m; \qquad W_{nn} = \sum_{m \neq n} w_{mn} \tag{10}$$

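For the model with four states considered here, the matrix W (with the states
ordered D, 1, 2, N) is given by

$$W = \frac{1}{t_o} \begin{pmatrix}
\frac{1}{1+e^{g_1}} + \frac{1}{1+e^{g_2}} & -\frac{1}{1+e^{-g_1}} & -\frac{1}{1+e^{-g_2}} & 0 \\
-\frac{1}{1+e^{g_1}} & \frac{1}{1+e^{-g_1}} + \frac{1}{1+e^{g_N-g_1}} & 0 & -\frac{1}{1+e^{g_1-g_N}} \\
-\frac{1}{1+e^{g_2}} & 0 & \frac{1}{1+e^{-g_2}} + \frac{1}{1+e^{g_N-g_2}} & -\frac{1}{1+e^{g_2-g_N}} \\
0 & -\frac{1}{1+e^{g_N-g_1}} & -\frac{1}{1+e^{g_N-g_2}} & \frac{1}{1+e^{g_1-g_N}} + \frac{1}{1+e^{g_2-g_N}}
\end{pmatrix}$$

To simplify the notation, we have used here dimensionless free-energy differences
g_i = G_i/RT (i = 1, 2, or N) of the partially folded states 1 and 2 and the native
state with respect to the denatured state.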

The general solution P(t) of the master equation can be expressed in terms of the
eigenvalues λ and eigenvectors Y_λ of the matrix W:

$$P(t) = \sum_{\lambda} c_\lambda Y_\lambda \exp[-\lambda t] \tag{11}$$

The prefactors c_λ in this general solution depend on the initial conditions at time
t = 0. For the 4×4 matrix above, the 4 eigenvalues are given by λ = 0, 1 − q,
1 + q, and 2, in units of 1/t_o, with

$$q = \frac{1 - e^{g_N - g_1 - g_2}}{\sqrt{(1 + e^{-g_1})(1 + e^{-g_2})(1 + e^{g_N - g_1})(1 + e^{g_N - g_2})}} \tag{12}$$

Since we have −1 < q < 1, the three nonzero eigenvalues are positive and describe
the relaxation to the equilibrium state of the model (see eq. (11)). The equilibrium
state simply is c_0 Y_0 where Y_0 is the eigenvector with eigenvalue 0.
This model exhibits two-state folding kinetics under two conditions. First, the
native state has to be stable, i.e. the free energy g_N of the native state must be
significantly smaller than the free energies of the other three states. Second, the
free energy differences g_1 and g_2 between the intermediate states and the denatured
state have to be significantly larger than 1, i.e. G_1 and G_2 have to be significantly
larger than RT. The partially folded states then constitute
the transition-state ensemble. Under these two conditions, the three Boltzmann
weights e^{g_N − g_1 − g_2}, e^{g_N − g_1}, and e^{g_N − g_2} in eq. (12) are much
smaller than 1, and also much smaller than e^{−g_1} and e^{−g_2}, which leads to

$$q \simeq \frac{1}{\sqrt{(1 + e^{-g_1})(1 + e^{-g_2})}} \tag{13}$$

For large barrier energies g_1 and g_2, we have e^{−g_1} ≪ 1 and e^{−g_2} ≪ 1, and therefore
(1 + e^{−g_1})(1 + e^{−g_2}) ≃ 1 + e^{−g_1} + e^{−g_2}. If we now use the expansion
(1 + x)^{−1/2} ≃ 1 − x/2 with x = e^{−g_1} + e^{−g_2} ≪ 1, the smallest nonzero relaxation
rate, or folding rate, k = 1 − q is given by eq. (6), i.e. by k ≃ (1/2)(e^{−g_1} + e^{−g_2})
in the notation used in this appendix. The folding rate k is much smaller than the other
two relaxation rates 1 + q and 2, which correspond to an initial 'burst phase'.
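The eigenvalues quoted in this appendix can be checked numerically without external libraries by evaluating the characteristic polynomial det(W − λI) at λ = 0, 1 − q, 1 + q, and 2. The Python sketch below assumes the state order D, 1, 2, N, free energies in units of RT, and rates in units of 1/t_o; the helper names and the example free energies are illustrative.

```python
import math

def w_matrix(g1, g2, gN):
    """Matrix W of the four-state model (state order D, 1, 2, N; units 1/t_o)."""
    f = lambda x: 1.0 / (1.0 + math.exp(x))  # rate for a free-energy step x
    return [
        [f(g1) + f(g2), -f(-g1), -f(-g2), 0.0],
        [-f(g1), f(-g1) + f(gN - g1), 0.0, -f(g1 - gN)],
        [-f(g2), 0.0, f(-g2) + f(gN - g2), -f(g2 - gN)],
        [0.0, -f(gN - g1), -f(gN - g2), f(g1 - gN) + f(g2 - gN)],
    ]

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n, result = len(a), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def q_value(g1, g2, gN):
    """Eq. (12)."""
    num = 1.0 - math.exp(gN - g1 - g2)
    den = math.sqrt((1 + math.exp(-g1)) * (1 + math.exp(-g2))
                    * (1 + math.exp(gN - g1)) * (1 + math.exp(gN - g2)))
    return num / den

def char_poly(matrix, lam):
    """det(W - lam * I), which vanishes when lam is an eigenvalue of W."""
    shifted = [[x - (lam if i == j else 0.0) for j, x in enumerate(row)]
               for i, row in enumerate(matrix)]
    return det(shifted)

# Illustrative landscape: barriers of 5 RT and 6 RT, native state at -8 RT.
g1, g2, gN = 5.0, 6.0, -8.0
W, q = w_matrix(g1, g2, gN), q_value(g1, g2, gN)
residuals = [char_poly(W, lam) for lam in (0.0, 1.0 - q, 1.0 + q, 2.0)]
k_exact = 1.0 - q
k_approx = 0.5 * (math.exp(-g1) + math.exp(-g2))  # eq. (6)
print(residuals, k_exact, k_approx)
```

For these example barriers the approximate folding rate lies within about one percent of 1 − q, and the agreement improves as the barriers grow.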
Table 1: Mutational data for the FBP WW domain

mutation    Φ_exp            ΔG_N,exp         affected structural elements

"P7 A 
III 1 1\ 


n R7 -1-0 91 
U.O ( nz KJ.ZL 


U.OZ ZC U. iO 


hp 1 


wop 
vv or 


n 9/1 -1- n IT? 


i.uo zc u. iu 


np 1, nc 


TQ A 


n HQ 4- n H/i 
— u.uy zt U.U4 


u.yo zt u.uy 


np 1 


TOP 


n Q/L -1- n 9n 

(J.c)^ Zt U.ZU 


n c:n 4- n 1 n 
u.ou zt u.iu 


np 1 


V1 1 A 


U.OO IC U. iU 


n fiQ 4- n 1 1 

U.OO ZC U. i i 


np i 


T1 A 


— u.uo m u.u ( 


81 4-017 
U.oi zC U. i ( 


npi, nc 




n Q9 -1- n 9c^ 
— U.OZ IC u.zo 


n c;8 4- n 99 
u.oo zc u.zz 


np 1, nc 


A 1 /IP 


n KQ -I- n 98 

u.oy zt u.zo 


n i^n 4- n 99 
u.ou zt u.zz 


np 1 




n 89 -I- n 1 R 

U.oZ zt U.iD 


0/194-0 HQ 

u.'iZ zt u.uy 


np 1 




n 77 -1- n 1 7 

U. ( ( IC U. i ( 


n QQ 4- n DQ 

u.oy zc u.uy 


np 1 


PI A 


1 1 7 -1- n 99 
i . i ( IC u.zz 


1 QQ 4- n 97 
i.oo zc u.z ( 


hp 1 


Tl 8 A 


n QQ -I- n 97 
u.yo m u.z ( 


n c^/i 4- n 1 7 

U.04i zt U. i ( 


ViT-i 1 
np 1 


Tl fiP 


u. / o zt u.uo 


11/14-0 HQ 

1.14 Zt U.uy 


np 1 


VI OA 


nil -1- n 
u.ii zt u.uo 


n R7 4- n 1 Q 

U.O / zt U.IO 


hp 1, hp 


1 zur 


u.uo ZC U. iO 


n fi8 4- n 1 8 
U.Oo zc u. io 


np 1, np 


V91 A 


n 98 -1- n (19 
U.zo ZL u.uz 


1 7n -I- n 1 n 

1. ( u ZL U.IU 


np 1 J np 


R24A        0.29 ± 0.09      0.78 ± 0.17      hp 1, hp 2
T25A        0.39 ± 0.04      2.51 ± 0.18      hp 2
T25S        0.27 ± 0.03      1.08 ± 0.09      hp 2
L26A        0.08 ± 0.08      0.56 ± 0.12      hp 2
L26G        0.45 ± 0.04      -1.29 ± 0.10     hp 2
E27A        0.12 ± 0.04      1.02 ± 0.13      hp 2, hc
T29G        0.09 ± 0.02      1.89 ± 0.11      hp 2, hc
W30A        0.19 ± 0.06      0.76 ± 0.14      hp 2, hc
L36A        -0.30 ± 0.16     0.91 ± 0.14      hp 2, hc
L36V        -0.13 ± 0.09     0.53 ± 0.14      hp 2, hc

Experimental Φ-values and stability changes ΔG_N,exp are from Petrovich et al. [33].
The information on the structural elements affected by the mutations is derived
from the contact map shown in Fig. 3. These structural elements are the hairpin
1 (hp 1), hairpin 2 (hp 2), and the small hydrophobic core (hc) of the protein (see
text).



Table 2: Experimental and calculated stability changes for mutations of the FBP 
WW domain that affect several structural elements 
Experimental data for the stability changes ΔG_N,exp are from Petrovich et al. [33].
The stability changes ΔG_N, ΔG_1, and ΔG_2 for the whole protein and for hairpin 1
or 2, respectively, have been calculated with the program FOLD-X [65,66]. For
mutations to alanine (A) or glycine (G) and the mutation W8F, native structures
for the mutant proteins have been generated by truncation of atoms. For the
mutations Y20F and L36V, mutant structures were generated with the program
WHAT IF [74]. The wildtype structure used in the calculations is model 1 of the
PDB structure 1E0L [35]. To calculate ΔG_1 and ΔG_2, substructures consisting of
the residues 1 to 24 and 15 to 37 of the PDB structure have been used. The FOLD-X
calculations have been performed at the ionic strength 150 mM and temperature
283 K of the experiments [33]. Numbers in brackets indicate that the calculated
stability changes are not reliable since ΔG_N differs by more than a factor of 2
from ΔG_N,exp.

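
The bracketing criterion in this caption can be written as a small check. This is only a sketch: the function name and the example numbers are hypothetical, and treating values of opposite sign (or zero) as unreliable is an assumption that the caption does not state.

```python
def is_reliable(dg_calc, dg_exp, factor=2.0):
    """Return True if the calculated stability change agrees with the
    experimental one to within the given factor (default: factor of 2)."""
    # Assumption: opposite signs or a zero value count as unreliable.
    if dg_calc * dg_exp <= 0:
        return False
    ratio = dg_calc / dg_exp
    return 1.0 / factor <= ratio <= factor


# Hypothetical (calculated, experimental) pairs in kcal/mol, for illustration only.
examples = [(1.5, 1.0), (2.5, 1.0), (0.4, 1.0), (-1.0, 1.0)]
flags = [is_reliable(calc, exp) for calc, exp in examples]
```

Only the first pair passes the check; the other three would be shown in brackets under this reading of the criterion.
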
Table 3: Mutational data for the PIN WW domain
Experimental Φ-values and stability changes ΔG_N,exp for the mutations L7A to
W34A are from Jäger et al. [25], and for the amide-to-ester mutants K13κ to S32σ
from Deechongkit et al. [29]. Here, only mutations with stability change ΔG_N,exp >
0.8 kcal/mol are considered. The structural elements affected by the mutations
are assessed from the contact map shown in Fig. 3. These structural elements are
hairpin 1 (hp 1), hairpin 2 (hp 2), and the hydrophobic core (hc) of the protein
(see text).

Figure 1: Native structures of the FBP WW domain [35] and the PIN WW domain [36].
The structural representations have been generated with the programs VMD [75] and
Raster3D [76].

Figure 2: Simple energy landscape of the four-state model for WW domains. The
four states are the denatured state D, the native state N, and two transition-state
conformations hp 1 and hp 2 in which one of the two hairpins is formed. Here,
G_N is the free-energy difference between the native state N and the denatured
state D, which has the 'reference free energy' G_D = 0, and G_1 and G_2 are the
free-energy differences between the transition-state conformations and the
denatured state.

Figure 3: Contact matrices of the FBP and PIN WW domains. A black dot at
position (i, j) of a matrix indicates that the residues i and j are in contact. Two
residues are defined here to be in contact if the distance between any of their
non-hydrogen atoms is smaller than the cutoff distance 4 Å. The hairpins 1 and 2 of
the WW domains correspond to clusters of contacts.

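
The contact definition in this caption can be sketched as follows. The function name, the input layout (one list of non-hydrogen atom coordinates per residue, in Å), and the toy coordinates are assumptions for illustration. Whether sequence neighbours are excluded from the contact matrices is not stated in the caption, so no such filter is applied here.

```python
from itertools import product
from math import dist


def contact_matrix(residues, cutoff=4.0):
    """Boolean contact matrix: residues i and j are in contact if any pair
    of their non-hydrogen atoms is closer than `cutoff` (in Å)."""
    n = len(residues)
    contacts = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Smallest distance over all atom pairs of residues i and j.
            closest = min(dist(a, b) for a, b in product(residues[i], residues[j]))
            contacts[i][j] = contacts[j][i] = closest < cutoff
    return contacts


# Toy coordinates (hypothetical, not taken from a PDB structure).
toy = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(4.5, 0.0, 0.0), (8.0, 0.0, 0.0)],
    [(20.0, 0.0, 0.0)],
]
contacts = contact_matrix(toy)
```

In the toy input, the first two residues have an atom pair 3.5 Å apart and are therefore in contact, while the third residue is far from both.
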
Figure 4: Φ-values for mutations that only affect hairpin 1 of the FBP WW domain
(see also Table 1). Except for one outlier (open circle for mutation T9A), the
Φ-values are centered around the mean value 0.81 ± 0.06, with deviations mostly
within the estimated experimental errors [33].

Figure 5: Experimental Φ-values for the FBP WW domain versus theoretical
Φ-values obtained from a least-squares fit of eq. ^ with the single fit parameter χ_1.
From this fit, we obtain the values χ_1 = 0.77 ± 0.05 and χ_2 = 1 - χ_1 = 0.23 ± 0.05
for the fractions of the two transition-state conformations in which either hairpin
1 or hairpin 2 is formed. The Pearson correlation coefficient between theoretical
and experimental Φ-values is r = 0.90 if the outlier data point for mutation T9A
(open circle) is not considered, and r = 0.77 if the outlier is included.

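
A one-parameter least-squares fit of this kind, followed by the Pearson correlation between theoretical and experimental values, can be sketched as below. The sketch assumes that the theoretical Φ-value of each mutation is linear in the fit parameter (called chi1 in the code), Φ = chi1·a + (1 - chi1)·b, with per-mutation coefficients a and b. These coefficients and all numbers are synthetic placeholders, not the data of ref. [33], and the linear form is an assumption rather than the fitted equation itself.

```python
from math import sqrt


def fit_chi1(phi_exp, a, b):
    """Closed-form least-squares estimate of chi1 in phi = chi1*a + (1 - chi1)*b."""
    d = [ai - bi for ai, bi in zip(a, b)]        # sensitivity of phi to chi1
    y = [p - bi for p, bi in zip(phi_exp, b)]    # phi shifted by the chi1 = 0 value
    return sum(di * yi for di, yi in zip(d, y)) / sum(di * di for di in d)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Synthetic coefficients and Φ-values, generated with chi1 = 0.75.
a = [1.0, 1.0, 0.5, 0.0, 0.2]
b = [0.0, 0.0, 0.5, 1.0, 0.8]
phi_exp = [0.75, 0.75, 0.50, 0.25, 0.35]

chi1 = fit_chi1(phi_exp, a, b)
phi_theo = [chi1 * ai + (1.0 - chi1) * bi for ai, bi in zip(a, b)]
r = pearson_r(phi_theo, phi_exp)
```

Because the synthetic Φ-values are noise-free, the fit recovers the generating parameter and the correlation is perfect; with experimental data, both would carry errors such as those quoted in the caption.
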
Figure 6: Experimental Φ-values for the PIN WW domain versus theoretical
Φ-values obtained from a least-squares fit of eq. (3), which results in the values
χ_1 = 0.67 ± 0.05 and χ_2 = 1 - χ_1 = 0.33 ± 0.05 for the fractions of the two
transition-state conformations. The Pearson correlation coefficient between
theoretical and experimental Φ-values is r = 0.85.